智能论文笔记

Adaptive Fine-Grained Sketch-Based Image Retrieval

Ayan Kumar Bhunia , Aneeshan Sain , Parth Shah , Animesh Gupta , Pinaki Nath Chowdhury , Tao Xiang , Yi-Zhe Song

分类：计算机视觉

2022-07-04

最近对基于细粒的基于草图的图像检索（FG-SBIR）的重点已转向将模型概括为新类别，而没有任何培训数据。但是，在现实世界中，经过训练的FG-SBIR模型通常应用于新类别和不同的人类素描器，即不同的绘图样式。尽管这使概括问题复杂化，但幸运的是，通常可以使用一些示例，从而使模型适应新的类别/样式。在本文中，我们提供了一种新颖的视角 - 我们没有要求使用概括的模型，而是提倡快速适应的模型，在测试过程中只有很少的样本（以几种方式）。为了解决这个新问题，我们介绍了一种基于几个关键修改的基于新型的模型 - 静态元学习（MAML）框架：（1）作为基于边缘的对比度损失的检索任务，我们简化了内部循环中的MAML训练使其更稳定和易于处理。（2）我们的对比度损失的边距也通过其余模型进行了元学习。（3）在外循环中引入了另外三个正规化损失，以使元学习的FG-SBIR模型对类别/样式适应更有效。在公共数据集上进行的广泛实验表明，基于概括和基于零射的方法的增益很大，还有一些强大的射击基线。

translated by 谷歌翻译

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Aarohi Srivastava , Abhinav Rastogi , Abhishek Rao , Abu Awal Md Shoeb , Abubakar Abid , Adam Fisch , Adam R. Brown , Adam Santoro , Aditya Gupta , Adrià Garriga-Alonso

分类：自然语言处理 | 人工智能 | 机器学习 | (统计)机器学习

2022-06-09

语言模型既展示了定量的改进，又展示了新的定性功能，随着规模的增加。尽管它们具有潜在的变革性影响，但这些新能力的特征却很差。为了为未来的研究提供信息，为破坏性的新模型能力做准备，并改善社会有害的效果，至关重要的是，我们必须了解目前和近乎未来的能力和语言模型的局限性。为了应对这一挑战，我们介绍了超越模仿游戏基准（Big Bench）。 Big Bench目前由204个任务组成，由132家机构的442位作者贡献。任务主题是多样的，从语言学，儿童发展，数学，常识性推理，生物学，物理学，社会偏见，软件开发等等。 Big-Bench专注于被认为超出当前语言模型的功能的任务。我们评估了OpenAI的GPT型号，Google内部密集变压器体系结构和大型基础上的开关稀疏变压器的行为，跨越了数百万到数十亿个参数。此外，一个人类专家评估者团队执行了所有任务，以提供强大的基准。研究结果包括：模型性能和校准都随规模改善，但绝对的术语（以及与评估者的性能相比）；在模型类中的性能非常相似，尽管带有稀疏性。逐渐和预测的任务通常涉及大量知识或记忆成分，而在临界规模上表现出“突破性”行为的任务通常涉及多个步骤或组成部分或脆性指标；社交偏见通常会随着含糊不清的环境而随着规模而增加，但这可以通过提示来改善。

translated by 谷歌翻译

Federated Learning with Client-Exclusive Classes

Jiayun Zhang , Xiyuan Zhang , Xinyang Zhang , Dezhi Hong , Rajesh K. Gupta , Jingbo Shang

分类：机器学习

2023-01-01

Existing federated classification algorithms typically assume the local annotations at every client cover the same set of classes. In this paper, we aim to lift such an assumption and focus on a more general yet practical non-IID setting where every client can work on non-identical and even disjoint sets of classes (i.e., client-exclusive classes), and the clients have a common goal which is to build a global classification model to identify the union of these classes. Such heterogeneity in client class sets poses a new challenge: how to ensure different clients are operating in the same latent space so as to avoid the drift after aggregation? We observe that the classes can be described in natural languages (i.e., class names) and these names are typically safe to share with all parties. Thus, we formulate the classification problem as a matching process between data representations and class representations and break the classification model into a data encoder and a label encoder. We leverage the natural-language class names as the common ground to anchor the class representations in the label encoder. In each iteration, the label encoder updates the class representations and regulates the data representations through matching. We further use the updated class representations at each round to annotate data samples for locally-unaware classes according to similarity and distill knowledge to local models. Extensive experiments on four real-world datasets show that the proposed method can outperform various classical and state-of-the-art federated learning methods designed for learning with non-IID data.

translated by 谷歌翻译

A Functional approach for Two Way Dimension Reduction in Time Series

Aniruddha Rajendra Rao , Haiyan Wang , Chetan Gupta

分类：机器学习

2023-01-01

The rise in data has led to the need for dimension reduction techniques, especially in the area of non-scalar variables, including time series, natural language processing, and computer vision. In this paper, we specifically investigate dimension reduction for time series through functional data analysis. Current methods for dimension reduction in functional data are functional principal component analysis and functional autoencoders, which are limited to linear mappings or scalar representations for the time series, which is inefficient. In real data applications, the nature of the data is much more complex. We propose a non-linear function-on-function approach, which consists of a functional encoder and a functional decoder, that uses continuous hidden layers consisting of continuous neurons to learn the structure inherent in functional data, which addresses the aforementioned concerns in the existing approaches. Our approach gives a low dimension latent representation by reducing the number of functional features as well as the timepoints at which the functions are observed. The effectiveness of the proposed model is demonstrated through multiple simulations and real data examples.

translated by 谷歌翻译

Landing a UAV in Harsh Winds and Turbulent Open Waters

Parakh M. Gupta , Eric Pairet , Tiago Nascimento , Martin Saska

分类：机器人

2022-12-31

Landing an unmanned aerial vehicle unmanned aerial vehicle (UAV) on top of an unmanned surface vehicle (USV) in harsh open waters is a challenging problem, owing to forces that can damage the UAV due to a severe roll and/or pitch angle of the USV during touchdown. To tackle this, we propose a novel model predictive control (MPC) approach enabling a UAV to land autonomously on a USV in these harsh conditions. The MPC employs a novel objective function and an online decomposition of the oscillatory motion of the vessel to predict, attempt, and accomplish the landing during near-zero tilt of the landing platform. The nonlinear prediction of the motion of the vessel is performed using visual data from an onboard camera. Therefore, the system does not require any communication with the USV or a control station. The proposed method was analyzed in numerous robotics simulations in harsh and extreme conditions and further validated in various real-world scenarios.

translated by 谷歌翻译

Towards Proactively Forecasting Sentence-Specific Information Popularity within Online News Documents

Sayar Ghosh Roy , Anshul Padhi , Risubh Jain , Manish Gupta , Vasudeva Varma

分类：自然语言处理 | 人工智能 | 机器学习

2022-12-31

Multiple studies have focused on predicting the prospective popularity of an online document as a whole, without paying attention to the contributions of its individual parts. We introduce the task of proactively forecasting popularities of sentences within online news documents solely utilizing their natural language content. We model sentence-specific popularity forecasting as a sequence regression task. For training our models, we curate InfoPop, the first dataset containing popularity labels for over 1.7 million sentences from over 50,000 online news documents. To the best of our knowledge, this is the first dataset automatically created using streams of incoming search engine queries to generate sentence-level popularity annotations. We propose a novel transfer learning approach involving sentence salience prediction as an auxiliary task. Our proposed technique coupled with a BERT-based neural model exceeds nDCG values of 0.8 for proactive sentence-specific popularity forecasting. Notably, our study presents a non-trivial takeaway: though popularity and salience are different concepts, transfer learning from salience prediction enhances popularity forecasting. We release InfoPop and make our code publicly available: https://github.com/sayarghoshroy/InfoPopularity

translated by 谷歌翻译

Self-Activating Neural Ensembles for Continual Reinforcement Learning

Sam Powers , Eliot Xing , Abhinav Gupta

分类：机器学习 | 人工智能

2022-12-31

The ability for an agent to continuously learn new skills without catastrophically forgetting existing knowledge is of critical importance for the development of generally intelligent agents. Most methods devised to address this problem depend heavily on well-defined task boundaries, and thus depend on human supervision. Our task-agnostic method, Self-Activating Neural Ensembles (SANE), uses a modular architecture designed to avoid catastrophic forgetting without making any such assumptions. At the beginning of each trajectory, a module in the SANE ensemble is activated to determine the agent's next policy. During training, new modules are created as needed and only activated modules are updated to ensure that unused modules remain unchanged. This system enables our method to retain and leverage old skills, while growing and learning new ones. We demonstrate our approach on visually rich procedurally generated environments.

translated by 谷歌翻译

Hybrid Deep Reinforcement Learning and Planning for Safe and Comfortable Automated Driving

Dikshant Gupta , Mathias Klusch

分类：机器人 | 人工智能

2022-12-30

We present a novel hybrid learning method, HyLEAR, for solving the collision-free navigation problem for self-driving cars in POMDPs. HyLEAR leverages interposed learning to embed knowledge of a hybrid planner into a deep reinforcement learner to faster determine safe and comfortable driving policies. In particular, the hybrid planner combines pedestrian path prediction and risk-aware path planning with driving-behavior rule-based reasoning such that the driving policies also take into account, whenever possible, the ride comfort and a given set of driving-behavior rules. Our experimental performance analysis over the CARLA-CTS1 benchmark of critical traffic scenarios revealed that HyLEAR can significantly outperform the selected baselines in terms of safety and ride comfort.

translated by 谷歌翻译

Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning

Colorado J. Reed , Ritwik Gupta , Shufan Li , Sarah Brockman , Christopher Funk , Brian Clipp , Christopher Funk , Salvatore Candido , Matt Uyttendaele , Trevor Darrell

分类：计算机视觉

2022-12-30

Remote sensing imagery provides comprehensive views of the Earth, where different sensors collect complementary data at different spatial scales. Large, pretrained models are commonly finetuned with imagery that is heavily augmented to mimic different conditions and scales, with the resulting models used for various tasks with imagery from a range of spatial scales. Such models overlook scale-specific information in the data. In this paper, we present Scale-MAE, a pretraining method that explicitly learns relationships between data at different, known scales throughout the pretraining process. Scale-MAE pretrains a network by masking an input image at a known input scale, where the area of the Earth covered by the image determines the scale of the ViT positional encoding, not the image resolution. Scale-MAE encodes the masked image with a standard ViT backbone, and then decodes the masked image through a bandpass filter to reconstruct low/high frequency images at lower/higher scales. We find that tasking the network with reconstructing both low/high frequency images leads to robust multiscale representations for remote sensing imagery. Scale-MAE achieves an average of a $5.0\%$ non-parametric kNN classification improvement across eight remote sensing datasets compared to current state-of-the-art and obtains a $0.9$ mIoU to $3.8$ mIoU improvement on the SpaceNet building segmentation transfer task for a range of evaluation scales.

translated by 谷歌翻译

Examining Political Rhetoric with Epistemic Stance Detection

Ankita Gupta , Su Lin Blodgett , Justin H Gross , Brendan O'Connor

分类：自然语言处理

2022-12-29

Participants in political discourse employ rhetorical strategies -- such as hedging, attributions, or denials -- to display varying degrees of belief commitments to claims proposed by themselves or others. Traditionally, political scientists have studied these epistemic phenomena through labor-intensive manual content analysis. We propose to help automate such work through epistemic stance prediction, drawn from research in computational semantics, to distinguish at the clausal level what is asserted, denied, or only ambivalently suggested by the author or other mentioned entities (belief holders). We first develop a simple RoBERTa-based model for multi-source stance predictions that outperforms more complex state-of-the-art modeling. Then we demonstrate its novel application to political science by conducting a large-scale analysis of the Mass Market Manifestos corpus of U.S. political opinion books, where we characterize trends in cited belief holders -- respected allies and opposed bogeymen -- across U.S. political ideologies.

translated by 谷歌翻译